{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Box 3D Image Transform\n",
    "This notebook is intended to demonstrate the differences of the different coordinate systems used for 3D Boxes.\n",
    "In general, 4 different coordinate systems are used with 3 of them are described in https://github.com/mcordts/cityscapesScripts/blob/master/docs/csCalibration.pdf\n",
    "1. The vehicle coordinate system *V* according to ISO 8855 with the origin on the ground below of the rear axis center, *x* pointing in driving direction, *y* pointing left, and *z* pointing up.\n",
    "2. The camera coordinate system *C* with the origin in the camera’s optical center and same orientation as *V*.\n",
    "3. The image coordinate system *I* with the origin in the top-left image pixel, *u* pointing right, and *v* pointing down.\n",
    "4. In addition, we also add the coordinate system *S* with the same origin as *C*, but the orientation of *I*, ie. *x* pointing right, *y* down, and *z* in the driving direction.\n",
    "\n",
    "All GT annotations are given in the ISO coordinate system *V* and hence, the evaluation requires the data to be available in this coordinate system.\n",
    "\n",
    "In this notebook, the transformations between all these coordinate frames are described exemplarily by loading a 3D box annotation and calculate the projection into 2D image, ie. coordinate system *I*."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Sample annotation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "sample_annotation = {\n",
    "    \"imgWidth\": 2048,\n",
    "    \"imgHeight\": 1024,\n",
    "    \"sensor\": {\n",
    "        \"sensor_T_ISO_8855\": [\n",
    "            [\n",
    "                0.9990881051503779,\n",
    "                -0.01948468779721943,\n",
    "                -0.03799085532693703,\n",
    "                -1.6501524664770573\n",
    "            ],\n",
    "            [\n",
    "                0.019498764210995674,\n",
    "                0.9998098810245096,\n",
    "                0.0,\n",
    "                -0.1331288872611436\n",
    "            ],\n",
    "            [\n",
    "                0.03798363254444427,\n",
    "                -0.0007407747301939942,\n",
    "                0.9992780868764849,\n",
    "                -1.2836173638418473\n",
    "            ]\n",
    "        ],\n",
    "        \"fx\": 2262.52,\n",
    "        \"fy\": 2265.3017905988554,\n",
    "        \"u0\": 1096.98,\n",
    "        \"v0\": 513.137,\n",
    "        \"baseline\": 0.209313\n",
    "    },\n",
    "    \"objects\": [\n",
    "        {\n",
    "            \"2d\": {\n",
    "                \"modal\": [\n",
    "                    609,\n",
    "                    420,\n",
    "                    198,\n",
    "                    111\n",
    "                ],\n",
    "                \"amodal\": [\n",
    "                    602,\n",
    "                    415,\n",
    "                    214,\n",
    "                    118\n",
    "                ]\n",
    "            },\n",
    "            \"3d\": {\n",
    "                \"center\": [\n",
    "                    33.95,\n",
    "                    5.05,\n",
    "                    0.57\n",
    "                ],\n",
    "                \"dimensions\": [\n",
    "                    4.3,\n",
    "                    1.72,\n",
    "                    1.53\n",
    "                ],\n",
    "                \"rotation\": [\n",
    "                    0.9735839424380041,\n",
    "                    -0.010751769161021867,\n",
    "                    0.0027191710555974913,\n",
    "                    0.22805988817753894\n",
    "                ],\n",
    "                \"type\": \"Mid Size Car\",\n",
    "                \"format\": \"CRS_ISO8855\"\n",
    "            },\n",
    "            \"occlusion\": 0.0,\n",
    "            \"truncation\": 0.0,\n",
    "            \"instanceId\": 26010,\n",
    "            \"label\": \"car\",\n",
    "            \"score\": 1.0\n",
    "        }\n",
    "    ]\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Python imports"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from cityscapesscripts.helpers.annotation import CsBbox3d\n",
    "from cityscapesscripts.helpers.box3dImageTransform import (\n",
    "    Camera, \n",
    "    Box3dImageTransform,\n",
    "    CRS_V,\n",
    "    CRS_C,\n",
    "    CRS_S\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Create the camera\n",
    "``sensor_T_ISO_8855`` is the transformation matrix from coordinate system *V* to *C*."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "camera = Camera(fx=sample_annotation[\"sensor\"][\"fx\"],\n",
    "                fy=sample_annotation[\"sensor\"][\"fy\"],\n",
    "                u0=sample_annotation[\"sensor\"][\"u0\"],\n",
    "                v0=sample_annotation[\"sensor\"][\"v0\"],\n",
    "                sensor_T_ISO_8855=sample_annotation[\"sensor\"][\"sensor_T_ISO_8855\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Load the annotation\n",
    "As the annotation is given in coordinate system *V*, it must be transformed from *V* &#8594; *C* &#8594; *S* &#8594; *I*."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create the Box3dImageTransform object\n",
    "box3d_annotation = Box3dImageTransform(camera=camera)\n",
    "\n",
    "# Create a CsBox3d object for the 3D annotation\n",
    "obj = CsBbox3d()\n",
    "obj.fromJsonText(sample_annotation[\"objects\"][0])\n",
    "\n",
    "# Initialize the 3D box with an annotation in coordinate system V. \n",
    "# You can alternatively pass CRS_S or CRS_C if you want to initalize the box in a different coordinate system.\n",
    "# Please note that the object's size is always given as [L, W, H] independently of the used coodrinate system.\n",
    "box3d_annotation.initialize_box_from_annotation(obj, coordinate_system=CRS_V)\n",
    "size_V, center_V, rotation_V = box3d_annotation.get_parameters(coordinate_system=CRS_V)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Print coordinates of cuboid vertices"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Get the vertices of the 3D box in the requested coordinate frame\n",
    "box_vertices_V = box3d_annotation.get_vertices(coordinate_system=CRS_V)\n",
    "box_vertices_C = box3d_annotation.get_vertices(coordinate_system=CRS_C)\n",
    "box_vertices_S = box3d_annotation.get_vertices(coordinate_system=CRS_S)\n",
    "\n",
    "# Print the vertices of the box.\n",
    "# loc is encoded with a 3-char code\n",
    "#   0: B/F: Back or Front\n",
    "#   1: L/R: Left or Right\n",
    "#   2: B/T: Bottom or Top\n",
    "# BLT -> Back left top of the object\n",
    "\n",
    "# Print in V coordinate system\n",
    "print(\"Vertices in V:\")\n",
    "print(\"     {:>8} {:>8} {:>8}\".format(\"x[m]\", \"y[m]\", \"z[m]\"))\n",
    "for loc, coord in box_vertices_V.items():\n",
    "    print(\"{}: {:8.2f} {:8.2f} {:8.2f}\".format(loc, coord[0], coord[1], coord[2]))\n",
    "    \n",
    "# Print in C coordinate system\n",
    "print(\"\\nVertices in C:\")\n",
    "print(\"     {:>8} {:>8} {:>8}\".format(\"x[m]\", \"y[m]\", \"z[m]\"))\n",
    "for loc, coord in box_vertices_C.items():\n",
    "    print(\"{}: {:8.2f} {:8.2f} {:8.2f}\".format(loc, coord[0], coord[1], coord[2]))\n",
    "    \n",
    "# Print in S coordinate system\n",
    "print(\"\\nVertices in S:\")\n",
    "print(\"     {:>8} {:>8} {:>8}\".format(\"x[m]\", \"y[m]\", \"z[m]\"))\n",
    "for loc, coord in box_vertices_S.items():\n",
    "    print(\"{}: {:8.2f} {:8.2f} {:8.2f}\".format(loc, coord[0], coord[1], coord[2]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Print box parameters"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Similar to the box vertices, you can retrieve box parameters center, size and rotation in any coordinate system\n",
    "size_V, center_V, rotation_V = box3d_annotation.get_parameters(coordinate_system=CRS_V)\n",
    "# size_C, center_C, rotation_C = box3d_annotation.get_parameters(coordinate_system=CRS_C)\n",
    "# size_S, center_S, rotation_S = box3d_annotation.get_parameters(coordinate_system=CRS_S)\n",
    "\n",
    "print(\"Size:    \", size_V)\n",
    "print(\"Center:  \", center_V)\n",
    "print(\"Rotation:\", rotation_V)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Get 2D image coordinates"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Get the vertices of the 3D box in the image coordinates\n",
    "box_vertices_I = box3d_annotation.get_vertices_2d()\n",
    "\n",
    "# Print the vertices of the box.\n",
    "# loc is encoded with a 3-char code\n",
    "#   0: B/F: Back or Front\n",
    "#   1: L/R: Left or Right\n",
    "#   2: B/T: Bottom or Top\n",
    "# BLT -> Back left top of the object\n",
    "\n",
    "print(\"\\n     {:>8} {:>8}\".format(\"u[px]\", \"v[px]\"))\n",
    "for loc, coord in box_vertices_I.items():\n",
    "    print(\"{}: {:8.2f} {:8.2f}\".format(loc, coord[0], coord[1]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Exemplarily generate amodal 2D bounding box"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# generate amodal 2D box from these values\n",
    "xmin = int(min([p[0] for p in box_vertices_I.values()]))\n",
    "ymin = int(min([p[1] for p in box_vertices_I.values()]))\n",
    "xmax = int(max([p[0] for p in box_vertices_I.values()]))\n",
    "ymax = int(max([p[1] for p in box_vertices_I.values()]))\n",
    "\n",
    "bbox_amodal = [xmin, ymin, xmax, ymax]\n",
    "\n",
    "print(\"Amodal 2D bounding box\")\n",
    "print(bbox_amodal)\n",
    "# load from CsBbox3d object, these 2 bounding boxes should be the same\n",
    "print(obj.bbox_2d.bbox_amodal)\n",
    "\n",
    "assert bbox_amodal == obj.bbox_2d.bbox_amodal"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Check for cycle consistency\n",
    "A box initialized in *V* and converted to *S* and *C* and back need to give the initial values. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Initialize box in V\n",
    "box3d_annotation.initialize_box(size=sample_annotation[\"objects\"][0][\"3d\"][\"dimensions\"],\n",
    "                              quaternion=sample_annotation[\"objects\"][0][\"3d\"][\"rotation\"],\n",
    "                              center=sample_annotation[\"objects\"][0][\"3d\"][\"center\"],\n",
    "                              coordinate_system=CRS_V)\n",
    "size_VV, center_VV, rotation_VV = box3d_annotation.get_parameters(coordinate_system=CRS_V)\n",
    "\n",
    "# Retrieve parameters in C, initialize in C and retrieve in V\n",
    "size_C, center_C, rotation_C = box3d_annotation.get_parameters(coordinate_system=CRS_C)\n",
    "box3d_annotation.initialize_box(size=size_C,\n",
    "                              quaternion=rotation_C,\n",
    "                              center=center_C,\n",
    "                              coordinate_system=CRS_C)\n",
    "size_VC, center_VC, rotation_VC = box3d_annotation.get_parameters(coordinate_system=CRS_V)\n",
    "\n",
    "# Retrieve parameters in S, initialize in S and retrieve in V\n",
    "size_S, center_S, rotation_S = box3d_annotation.get_parameters(coordinate_system=CRS_S)\n",
    "box3d_annotation.initialize_box(size=size_S,\n",
    "                              quaternion=rotation_S,\n",
    "                              center=center_S,\n",
    "                              coordinate_system=CRS_S)\n",
    "size_VS, center_VS, rotation_VS = box3d_annotation.get_parameters(coordinate_system=CRS_V)\n",
    "\n",
    "assert np.isclose(size_VV, size_VC).all() and np.isclose(size_VV, size_VS).all()\n",
    "assert np.isclose(center_VV, center_VC).all() and np.isclose(center_VV, center_VS).all()\n",
    "assert (rotation_VV == rotation_VC) and (rotation_VV == rotation_VS)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
